Formula 1 World Championship Data Analysis

Introduction and Problem Statement

Formula One (F1) is one of the most popular sports in the world and the highest class of international racing for single-seater formula racing cars. It is sanctioned by the Fédération Internationale de l’Automobile (FIA), which was established on 20 June 1904. The championship was inaugurated as the World Drivers’ Championship on 13 May 1950 at Silverstone in the United Kingdom, and in 1981 it became known as the FIA Formula One World Championship.

A Formula One season consists of a series of races, called Grands Prix, held around the world. The word “Formula” refers to the set of rules that all participating teams must adhere to, and Grand Prix is French for “grand prize”. The races are run on tracks that are graded “1” by the FIA, hence the name Formula One.

Most races take place on purpose-built tracks certified by the FIA, usually situated in remote locations that are well connected with cities. A few races, such as the Monaco Grand Prix and the Singapore Grand Prix, are held on closed public roads. Formula One is one of the premier forms of racing in the world and draws huge audiences.

A driver participating in a Formula One race must hold a valid Super Licence issued by the FIA. The performance of the drivers and of the constructors of the cars is evaluated at the end of each race through a points system. At the end of a season, the FIA aggregates the points scored by each and awards two annual World Championships: one for the drivers and one for the constructors.

Formula 1 has fans all over the world and has become a multi-billion-dollar business. We picked this dataset to draw valuable insights from it, in order to understand how Formula 1 has performed over the years and to extract its key features.

Dataset URL: https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020

From the given dataset, we focus on three main perspectives, Drivers, Constructors (Manufacturers) and Circuits, and carry out an in-depth analysis of each.

Drivers

Formula 1 began in 1950 and since then many legendary drivers and teams have dazzled audiences around the world. From Juan Manuel Fangio to Alberto Ascari, Ayrton Senna to Michael Schumacher, Lewis Hamilton to Max Verstappen, F1 has crowned many champions over the years. In the world of Formula One (F1) racing, drivers play a crucial role as they are the ones who are behind the wheel and compete in races. The role of the driver in F1 is to control the car and achieve the best possible finish in each race.

To be successful in F1, a driver must possess a combination of physical and mental skills, including quick reflexes, endurance, and the ability to make split-second decisions under pressure. They must also have a deep understanding of their car and the track, as well as the ability to work closely with their team to make informed decisions about strategy, tire choice, and other key elements of the race.

Overall, the role of the driver in F1 is to push the limits of their car and their own abilities, and to compete at the highest level in the pursuit of victory.

To find the Top 10 F1 drivers by number of wins to date

Drivers are the superstars of Formula 1, and here we analyze which drivers have secured the most race wins.

Lewis Hamilton and Michael Schumacher top the list in the bar chart of drivers’ all-time victories below. A supplementary stacked bar chart compares the total races and victories of the Top 10 drivers. The graph suggests that there may be some correlation between a driver’s win total and the number of races in which they have competed.
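The win count and the races-versus-wins comparison described above can be sketched as follows. The analysis in this report was done in R; this is only an illustrative Python sketch using the standard library. The rows are hypothetical toy data, not values from the dataset, and the function name top_drivers is introduced here for illustration. It assumes that a finishing position of 1 marks a race win.

```python
from collections import Counter

# Hypothetical toy rows standing in for race results joined with driver names.
# Each tuple is (driver, finishing position); position 1 is assumed to mean a win.
results = [
    ("Hamilton", 1), ("Hamilton", 2), ("Hamilton", 1),
    ("Schumacher", 1), ("Schumacher", 3),
    ("Alonso", 4), ("Alonso", 1), ("Alonso", 5), ("Alonso", 7),
]

def top_drivers(rows, n=10):
    """Return (driver, wins, races) for the n drivers with the most wins."""
    races = Counter(driver for driver, _ in rows)
    wins = Counter(driver for driver, pos in rows if pos == 1)
    # Sort by wins (descending), breaking ties alphabetically by name.
    ranked = sorted(races, key=lambda d: (-wins[d], d))
    return [(d, wins[d], races[d]) for d in ranked[:n]]

table = top_drivers(results)
for driver, won, entered in table:
    # The win rate puts wins in the context of races entered.
    print(f"{driver:<12} wins={won} races={entered} win_rate={won / entered:.2f}")
```

Reporting the win rate next to the raw counts is one way to separate drivers who win often from drivers who have simply entered many races.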

To find the historical distribution of driver nationalities in the F1 championship since 1950.

To understand why F1 has had more British drivers and champions than any other nationality, we must go back to World War II and the aerial battles fought against the Germans over the English Channel. The ongoing aerial conflict compelled the British to construct enormous airfields for their defence. After World War II and the fall of Nazi Germany, these airfields lay unused until a group of British motor enthusiasts decided to transform them into improvised race circuits. This quickly drew racing drivers, along with engineers who had worked on intricate fighter aircraft engines during the war, to build the best race vehicles and test them on the converted tracks. One of the airfields later developed into the Silverstone Circuit, the “Mecca” of motorsport. Over the years this flow of racing talent to Britain has led many F1 teams to establish their headquarters in the area. As of 2022, six of the ten constructors had offices there.

To find the most wins by a driver in a single season.

From the chart above, we can see that Max Verstappen holds the record for the most wins in a single season, with 15 in the 2022 season. Next, Schumacher and Vettel share the second-highest total, with 13 wins each in 2004 and 2013, respectively. Each of them completely dominated his season. Both Schumacher and Hamilton have compelling arguments for being the greatest F1 driver of all time, so picking either one would not be inappropriate.
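The per-season win count behind such a chart can be sketched as a count over (driver, year) pairs. The report's analysis was done in R; this is only an illustrative Python sketch. The rows are hypothetical toy data with far fewer races than a real season, and the function name season_win_records is an assumption made for the example.

```python
from collections import Counter

# Hypothetical toy rows: (season year, winning driver), one row per race won.
race_winners = [
    (2004, "Schumacher"), (2004, "Schumacher"), (2004, "Barrichello"),
    (2013, "Vettel"), (2013, "Vettel"), (2013, "Alonso"),
    (2022, "Verstappen"), (2022, "Verstappen"), (2022, "Verstappen"),
]

def season_win_records(rows):
    """Count wins per (driver, year) and sort from most to fewest wins."""
    counts = Counter((driver, year) for year, driver in rows)
    # Ties are broken by driver name and year so the order is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

records = season_win_records(race_winners)
for (driver, year), wins in records:
    print(f"{year} {driver:<12} {wins}")
```

Counting on the pair rather than on the driver alone is what keeps a strong career total from being mistaken for a strong single season.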

Constructors

In Formula 1, constructors are the businesses, organizations, or manufacturers in charge of designing and building the racing cars. Each constructor enters a racing team with two drivers in every Grand Prix to compete for World Championship points. Constructors are a crucial component of the sport because they are the teams that support the cars and the drivers. Without the Constructors’ Championship, no team would strive as hard for success as they do.

Who are the most successful constructors?

The bar graph shows the top 10 constructors by total points earned. Ferrari, Mercedes, Red Bull and McLaren turn out to be the most successful constructors, each with more than 5000 points.
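The ranking by total points amounts to a grouped sum followed by a sort. The report's analysis was done in R; this is only an illustrative Python sketch. The rows are hypothetical toy data and the function name total_points is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical toy rows: (constructor, points scored in one race result).
points_rows = [
    ("Ferrari", 25.0), ("Ferrari", 18.0), ("Mercedes", 25.0),
    ("Red Bull", 15.0), ("Mercedes", 12.0), ("McLaren", 10.0),
]

def total_points(rows):
    """Sum points per constructor and sort from highest to lowest total."""
    totals = defaultdict(float)
    for constructor, pts in rows:
        totals[constructor] += pts
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

ranking = total_points(points_rows)
for constructor, pts in ranking[:10]:
    print(f"{constructor:<10} {pts:.1f}")
```

Because the points awarded per race have changed over the decades, totals like these tend to favour constructors that have been active in recent seasons.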

Most successful Constructor over last decade (Top 4 Constructor points comparison)

The graphic shows that Mercedes has regularly dominated the points, but in recent years the other teams have begun to catch up and challenge them. It appears that Mercedes’ golden age is coming to an end or, at the very least, that they will have to battle hard to stay on top.

Constructor wins by Origin Country

British constructors have won more races (494) than constructors of any other nationality. Irish, Canadian and Japanese constructors have won the fewest.

Max Speed Distribution by Constructors

Even though the maximum speed distribution is essentially the same across all constructors, even a slight increase in speed can have a significant impact in this sport. The Mercedes cars achieved the highest speeds, which is consistent with their dominance of recent seasons and with the cars being better built and delivering better results.

Circuits

Altitude dependency on engine failure

At higher altitudes the air is less dense, so less air passes through the radiators and intakes to cool the brakes and engine. Engines also need oxygen to sustain combustion, and a lack of it leads to a loss of performance. The main indicators of such scenarios in the data are retirements caused by overheating of transmission and engine components.

Circuit                          Altitude (m)   Engine failures
Autódromo Hermanos Rodríguez     2227           8
Autodromo Nazionale di Monza     162            7
Marina Bay Street Circuit        18             6
Red Bull Ring                    678            6
Albert Park Grand Prix Circuit   10             4
Bahrain International Circuit    7              4
Yas Marina Circuit               3              4
Circuit de Barcelona-Catalunya   109            3
Circuit de Spa-Francorchamps     401            3
Circuit of the Americas          161            3

From the above choropleth map, we can identify the distribution of circuits around the globe. As expected, the Mexican circuit, the highest on the list, has the most overheating-related retirements (8), and the Red Bull Ring, the next highest by altitude, has 6. However, circuits close to sea level, such as Monza (7), Marina Bay (6) and Bahrain (4), see a comparable amount of trouble. This may be due to high track temperatures caused by their geographic location.
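One way to check how strongly altitude and engine failures move together is a Pearson correlation over the ten circuits in the table above. The report's analysis was done in R; this is only an illustrative Python sketch. It assumes the altitude column is in metres, and the 1000 m cut-off used to set aside the single high-altitude circuit is an arbitrary choice made for this example.

```python
import math

# (circuit, altitude, engine failures), copied from the table above.
# Altitude is assumed to be in metres.
circuits = [
    ("Autódromo Hermanos Rodríguez", 2227, 8),
    ("Autodromo Nazionale di Monza", 162, 7),
    ("Marina Bay Street Circuit", 18, 6),
    ("Red Bull Ring", 678, 6),
    ("Albert Park Grand Prix Circuit", 10, 4),
    ("Bahrain International Circuit", 7, 4),
    ("Yas Marina Circuit", 3, 4),
    ("Circuit de Barcelona-Catalunya", 109, 3),
    ("Circuit de Spa-Francorchamps", 401, 3),
    ("Circuit of the Americas", 161, 3),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def altitude_failure_correlation(rows):
    return pearson([alt for _, alt, _ in rows], [fails for _, _, fails in rows])

r_all = altitude_failure_correlation(circuits)

# Arbitrary cut-off for this sketch: drop circuits above 1000 m to see how
# much a single high-altitude circuit drives the result.
lowland = [row for row in circuits if row[1] < 1000]
r_lowland = altitude_failure_correlation(lowland)

print(f"all circuits:       r = {r_all:.2f}")
print(f"below 1000 m only:  r = {r_lowland:.2f}")
```

With only ten circuits, and one of them far higher than the rest, a coefficient like this is a rough indication rather than evidence of a causal link.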

Which circuits have the fastest lap speed records?

Each circuit has its own features, and parameters such as altitude and circuit design influence the performance of the drivers. Here we look at the fastest lap speeds recorded at the top circuits used in F1.

We can see that the Monza circuit in Italy has the fastest lap speed, 257.3 km/h, set by Rubens Barrichello on 12 September 2004 while he was driving for Ferrari. He achieved this feat at a circuit with a moderate altitude compared to the others. Next is the Silverstone circuit in the UK, with a fastest lap speed of 244 km/h. The Bahrain International Circuit has a record of 230 km/h even though it is located very close to sea level.

How have drivers performed at each circuit?

From the above spider charts, we can infer that Lewis Hamilton has a fairly even record across the circuits, with a high of 17 at the Silverstone circuit, whereas the other drivers’ records are spread less evenly. Rubens Barrichello has the highest count at any single circuit, with 19 at the “catalunya” circuit. Since Barrichello won 11 Grands Prix in his entire career, these counts most likely reflect race entries at each circuit rather than wins. A spider chart is a good way to show how each driver’s record varies across circuits.

Conclusion

We learnt to work with dataframes in R and discovered the functionality of various functions from different libraries. We learnt to deal with missing values, to manipulate, filter and rearrange data, to calculate mean values for dataframe columns, and to perform string and date operations. We got to know how group_by(), summarize() and many other functions operate on multiple columns of the dataset, and we used spread() to convert a long dataset to wide format. Answering the given questions helped us unearth hidden facts from the dataset.
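To make the long-to-wide conversion concrete, the sketch below pivots (year, constructor, points) rows into one row per year with one column per constructor. The report used spread() in R; this is only an illustrative Python sketch of the same idea, not the R function itself. The rows are hypothetical toy values and the Python function name spread is borrowed for the example.

```python
# Hypothetical long-format rows: (year, constructor, points).
long_rows = [
    (2020, "Mercedes", 30.0),
    (2020, "Red Bull", 20.0),
    (2021, "Mercedes", 25.0),
    (2021, "Red Bull", 28.0),
    (2021, "Ferrari", 15.0),
]

def spread(rows, fill=None):
    """Pivot (row_key, column_key, value) triples into one dict per row_key."""
    columns = sorted({col for _, col, _ in rows})
    wide = {}
    for row_key, col, value in rows:
        # Start each new row with every column present, set to the fill value.
        wide.setdefault(row_key, {c: fill for c in columns})[col] = value
    return wide

wide = spread(long_rows)
for year in sorted(wide):
    print(year, wide[year])
```

Combinations that never occur in the long data, such as Ferrari in 2020 here, appear as the fill value in the wide table, which is where missing values tend to surface after a reshape.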

This assignment also gave us good exposure to the domain of Formula 1, which was relatively new to us. We learnt more of its terminology and some key aspects of the sport.

The study presented above demonstrates that Mercedes dominated F1 from 2014 to 2019. This is consistent with the record of the top driver of all time, Lewis Hamilton, who is credited here with 99 Grand Prix wins. Red Bull is the team that appears to be catching up with Mercedes and giving them a run for their money in recent years. One of the primary questions we intended to answer with this study was whether there is a relationship between races and victories. This requires more research and a more complete analysis that includes other variables that could influence the number of wins. Although it appears that the Top 2 drivers, L. Hamilton and M. Schumacher, were able to win more races because of the number of races they took part in, this does not hold for a few other drivers, such as F. Alonso, who, despite being in the Top 10, has a large gap between his number of races and his number of wins.

Overall, the analysis of the F1 World Championship in R provides a comprehensive and in-depth look at one of the world’s most exciting and prestigious motorsports events. Whether you are a fan of the sport or simply interested in using data analysis to gain insights into complex systems, the F1 World Championship is a fascinating and endlessly rewarding subject for analysis.

References

  1. https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020
  2. https://rpubs.com/alisial/formula1-analysis
  3. https://plotly.com/r/gauge-charts/
  4. https://ggplot2.tidyverse.org/reference/geom_polygon.html
  5. https://htmlcolorcodes.com/